<html>
<head><meta charset="utf-8"><title>prometheus for crates.io · t-infra · Zulip Chat Archive</title></head>
<h2>Stream: <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/index.html">t-infra</a></h2>
<h3>Topic: <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html">prometheus for crates.io</a></h3>

<hr>

<base href="https://rust-lang.zulipchat.com">

<head><link href="https://rust-lang.github.io/zulip_archive/style.css" rel="stylesheet"></head>

<a name="232442482"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/232442482" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#232442482">(Mar 30 2021 at 15:11)</a>:</h4>
<p>wrote down the possible options I see for using prometheus with heroku</p>



<a name="232442485"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/232442485" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#232442485">(Mar 30 2021 at 15:11)</a>:</h4>
<p><a href="https://paper.dropbox.com/doc/crates.io-monitoring--BH7xt2hydhlC9nqMEJaovXGGAg-JWf5AxfJ1Nbc3lLuNcaTy">https://paper.dropbox.com/doc/crates.io-monitoring--BH7xt2hydhlC9nqMEJaovXGGAg-JWf5AxfJ1Nbc3lLuNcaTy</a></p>



<a name="232442595"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/232442595" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#232442595">(Mar 30 2021 at 15:12)</a>:</h4>
<p>(I still think running prometheus on each dyno is the least impactful change)</p>



<a name="232442601"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/232442601" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#232442601">(Mar 30 2021 at 15:12)</a>:</h4>
<p><span class="user-mention" data-user-id="116122">@simulacrum</span> <span class="user-mention" data-user-id="117568">@Aidan Hobson Sayers</span> ^</p>



<a name="232451043"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/232451043" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#232451043">(Mar 30 2021 at 16:02)</a>:</h4>
<p>I will try to get to reading and providing feedback today.</p>



<a name="232451746"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/232451746" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> nagisa <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#232451746">(Mar 30 2021 at 16:06)</a>:</h4>
<p>FWIW I had a ton of success having primary application merge metrics from a number of different sources in its <code>/metrics</code> endpoint.</p>
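<p>A minimal sketch of that kind of merge, assuming each source already hands back Prometheus text-format output and that the sources expose disjoint metric families. The <code>merge_expositions</code> helper and the <code>source</code> label are hypothetical names for illustration, not something from the thread or from crates.io.</p>

```python
def merge_expositions(sources: dict[str, str]) -> str:
    """Concatenate text-format payloads, tagging every sample with its source.

    Assumes the sources expose disjoint metric families and that any
    existing label set is non-empty (illustrative simplifications).
    """
    merged = []
    for source, text in sources.items():
        for raw in text.splitlines():
            line = raw.strip()
            if not line:
                continue
            if line.startswith("#"):
                # HELP/TYPE comments pass through unchanged.
                merged.append(line)
            elif "{" in line:
                # Sample with labels: put the source label first.
                name, _, tail = line.partition("{")
                merged.append(f'{name}{{source="{source}",{tail}')
            else:
                # Sample without labels: create a label set.
                name, _, value = line.partition(" ")
                merged.append(f'{name}{{source="{source}"}} {value}')
    return "\n".join(merged) + "\n"
```

<p>The primary application would return the result of <code>merge_expositions</code> from its own <code>/metrics</code> handler, so a single scrape picks up every source at once.</p>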



<a name="232451838"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/232451838" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> nagisa <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#232451838">(Mar 30 2021 at 16:06)</a>:</h4>
<p>this also has the benefit of ensuring that the metrics in the prometheus database are more or less aligned for the same <code>instance</code> label.</p>



<a name="232452136"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/232452136" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> nagisa <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#232452136">(Mar 30 2021 at 16:08)</a>:</h4>
<p>But I guess that doesn't help if the application doesn't know the nodes it runs on or how to merge the metrics.</p>



<a name="232456166"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/232456166" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#232456166">(Mar 30 2021 at 16:33)</a>:</h4>
<p>yeah, the whole problem is that heroku doesn't expose the underlying dynos in any way</p>



<a name="232456244"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/232456244" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#232456244">(Mar 30 2021 at 16:34)</a>:</h4>
<p>which is really unfortunate and prevents clean solutions</p>



<a name="232470934"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/232470934" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> nagisa <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#232470934">(Mar 30 2021 at 18:07)</a>:</h4>
<p>Is having a small custom server at a known IP address an option? The application could establish a persistent TCP connection to such a server when it starts. The server then could retrieve metrics from the applications through that connection. Kinda pushgateway but not quite.</p>
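<p>A rough sketch of that reverse-scrape idea, assuming a made-up framing: the collector sends a <code>SCRAPE</code> line over the connection the application opened, and the application answers with a byte count followed by the payload. The function names and the framing are hypothetical, not an existing protocol.</p>

```python
import socket
import threading


def app_side(conn, render_metrics):
    """Application side: answer scrape requests on a connection it dialed itself."""
    with conn, conn.makefile("rwb") as stream:
        for request in stream:
            if request.strip() != b"SCRAPE":
                break
            body = render_metrics().encode()
            # Length-prefixed reply so the collector knows where the payload ends.
            stream.write(str(len(body)).encode() + b"\n" + body)
            stream.flush()


def collector_scrape(stream):
    """Collector side: pull one metrics payload through the persistent connection."""
    stream.write(b"SCRAPE\n")
    stream.flush()
    size = int(stream.readline())
    return stream.read(size).decode()


def demo():
    """Wire both sides together with a local socket pair and scrape once."""
    app_conn, collector_conn = socket.socketpair()
    worker = threading.Thread(target=app_side, args=(app_conn, lambda: "up 1\n"))
    worker.start()
    stream = collector_conn.makefile("rwb")
    payload = collector_scrape(stream)
    stream.close()
    collector_conn.close()
    worker.join(timeout=5)
    return payload
```

<p>In a real deployment the application would connect to the collector's known address instead of using a socket pair, and the collector would expose what it gathers to Prometheus.</p>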



<a name="232483850"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/232483850" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#232483850">(Mar 30 2021 at 19:31)</a>:</h4>
<p>yes, but we'd have to develop and maintain that server</p>



<a name="232483985"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/232483985" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#232483985">(Mar 30 2021 at 19:32)</a>:</h4>
<p>just running a prometheus server in every dyno configured to write metrics to our main server requires way less development effort</p>



<a name="233346764"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/233346764" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#233346764">(Apr 06 2021 at 16:07)</a>:</h4>
<p><span class="user-mention" data-user-id="121055">@Pietro Albini</span> to follow up on this - sorry it's been a bit - I'd like to better understand how big a priority the per-dyno metrics are. I'm correct that we can expose the service metrics from <a href="http://crates.io">crates.io</a> anytime, right?</p>



<a name="233346892"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/233346892" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#233346892">(Apr 06 2021 at 16:08)</a>:</h4>
<p>How common are per-dyno problems that wouldn't be visible in the service metrics?</p>



<a name="233347061"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/233347061" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#233347061">(Apr 06 2021 at 16:09)</a>:</h4>
<p>we can indeed expose service metrics for <a href="http://crates.io">crates.io</a> at any time, yes</p>



<a name="233347131"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/233347131" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#233347131">(Apr 06 2021 at 16:10)</a>:</h4>
<p>but most of the metrics I would find useful right now are at the dyno level</p>



<a name="233347258"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/233347258" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#233347258">(Apr 06 2021 at 16:10)</a>:</h4>
<p>could you expand on that in the doc / clarify that prioritization? It might be helpful to know what kind of metrics are being considered</p>



<a name="233347354"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/233347354" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#233347354">(Apr 06 2021 at 16:11)</a>:</h4>
<p>I'll do that in a bit!</p>



<a name="233370089"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/233370089" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#233370089">(Apr 06 2021 at 18:53)</a>:</h4>
<p><span class="user-mention" data-user-id="116122">@simulacrum</span> I put the metrics I need in the document, and asked the rest of the <a href="http://crates.io">crates.io</a> team to do the same for the ones they need that I missed</p>



<a name="233370121"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/233370121" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#233370121">(Apr 06 2021 at 18:53)</a>:</h4>
<p>most of the useful ones are in the "instance/dyno" section</p>



<a name="233370145"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/233370145" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#233370145">(Apr 06 2021 at 18:53)</a>:</h4>
<p>I'll check your comments tomorrow</p>



<a name="233371413"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/233371413" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#233371413">(Apr 06 2021 at 19:02)</a>:</h4>
<p>thanks!</p>



<a name="234167236"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/234167236" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#234167236">(Apr 12 2021 at 14:21)</a>:</h4>
<p>finally found the time to reply to the comments!</p>



<a name="235016142"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235016142" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235016142">(Apr 17 2021 at 20:43)</a>:</h4>
<p>opened a PR on the <a href="http://crates.io">crates.io</a> side to serve metrics! <a href="https://github.com/rust-lang/crates.io/pull/3531">https://github.com/rust-lang/crates.io/pull/3531</a></p>



<a name="235016160"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235016160" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235016160">(Apr 17 2021 at 20:43)</a>:</h4>
<p>it implements both service-level and instance-level metrics, even though right now we can collect only service-level ones</p>



<a name="235016170"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235016170" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235016170">(Apr 17 2021 at 20:43)</a>:</h4>
<p><span class="user-mention silent" data-user-id="116122">simulacrum</span> did you investigate instance-level metrics more?</p>



<a name="235016183"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235016183" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235016183">(Apr 17 2021 at 20:43)</a>:</h4>
<p>(I know you've been busy, it's totally fine if you didn't manage to make progress <span aria-label="smile" class="emoji emoji-1f642" role="img" title="smile">:smile:</span>)</p>



<a name="235019664"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235019664" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235019664">(Apr 17 2021 at 21:17)</a>:</h4>
<p>you mean your comments -- no, not yet</p>



<a name="235019667"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235019667" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235019667">(Apr 17 2021 at 21:17)</a>:</h4>
<p>have we completed investigation on upgrading prometheus?</p>



<a name="235020577"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235020577" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235020577">(Apr 17 2021 at 21:32)</a>:</h4>
<p><span class="user-mention" data-user-id="121055">@Pietro Albini</span> one additional question I had -- is this setup going to be similar to what we end up with on other services (e.g., <a href="http://docs.rs">docs.rs</a>) if we end up putting those behind e.g. ELB after moving parts of them to ECS?</p>



<a name="235020613"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235020613" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235020613">(Apr 17 2021 at 21:33)</a>:</h4>
<p><span class="user-mention silent" data-user-id="116122">simulacrum</span> <a href="#narrow/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio/near/235019667">said</a>:</p>
<blockquote>
<p>have we completed investigation on upgrading prometheus?</p>
</blockquote>
<p>also no</p>



<a name="235020617"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235020617" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235020617">(Apr 17 2021 at 21:33)</a>:</h4>
<p>(and e.g. if we had 2 triagebot containers running instead of one, would we need this there too?)</p>



<a name="235020651"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235020651" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235020651">(Apr 17 2021 at 21:34)</a>:</h4>
<p><span class="user-mention silent" data-user-id="116122">simulacrum</span> <a href="#narrow/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio/near/235020577">said</a>:</p>
<blockquote>
<p><span class="user-mention silent" data-user-id="121055">Pietro Albini</span> one additional question I had -- is this setup going to be similar to what we end up with on other services (e.g., <a href="http://docs.rs">docs.rs</a>) if we end up putting those behind e.g. ELB after moving parts of them to ECS?</p>
</blockquote>
<p>no, we can configure prometheus to periodically fetch the private IPs of the fargate instances from the AWS API and scrape the containers directly</p>
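<p>One way this could look, sketched under assumptions: a small helper turns a list of container private IPs into the JSON target file that Prometheus reads through <code>file_sd_configs</code>. The lookup of the IPs from the AWS API is not shown, and the helper name, port and label below are hypothetical.</p>

```python
import json


def file_sd_targets(private_ips, port, labels):
    """Build the JSON document Prometheus reads via file-based service discovery."""
    # One target group: every container shares the same labels.
    targets = [f"{ip}:{port}" for ip in sorted(private_ips)]
    return json.dumps([{"targets": targets, "labels": labels}], indent=2)


if __name__ == "__main__":
    # Illustrative addresses only; a real run would get them from the AWS API.
    print(file_sd_targets(["10.0.2.7", "10.0.1.12"], 9090, {"service": "docs-rs"}))
```

<p>Rewriting that file periodically lets Prometheus pick up new containers without a restart, since it re-reads file-based targets when they change.</p>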



<a name="235020712"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235020712" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235020712">(Apr 17 2021 at 21:34)</a>:</h4>
<p>ah, so the only reason heroku is 'special' is that we have <em>no</em> access to the private IPs?</p>



<a name="235020715"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235020715" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235020715">(Apr 17 2021 at 21:34)</a>:</h4>
<p>yep</p>



<a name="235020724"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235020724" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235020724">(Apr 17 2021 at 21:34)</a>:</h4>
<p>and no way to connect to them even if we somehow figure them out</p>



<a name="235020768"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235020768" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235020768">(Apr 17 2021 at 21:35)</a>:</h4>
<p>sure, yeah</p>



<a name="235021472"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235021472" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235021472">(Apr 17 2021 at 21:48)</a>:</h4>
<p>hm, so I'm trying to figure out how heroku's language runtime metrics work - they don't have them for rust yet, but for a bunch of other languages</p>



<a name="235021476"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235021476" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235021476">(Apr 17 2021 at 21:48)</a>:</h4>
<p>my impression based on screenshots and the docs I can find is they're still service-level</p>



<a name="235021500"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235021500" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235021500">(Apr 17 2021 at 21:49)</a>:</h4>
<p>they also use a prometheus agent though, according to this <a href="https://blog.heroku.com/heroku-exec-language-runtime-metrics-ga-runtime-debugging">https://blog.heroku.com/heroku-exec-language-runtime-metrics-ga-runtime-debugging</a></p>



<a name="235021785"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235021785" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235021785">(Apr 17 2021 at 21:55)</a>:</h4>
<p>so, heroku runs this to collect language metrics</p>



<a name="235021790"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235021790" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235021790">(Apr 17 2021 at 21:55)</a>:</h4>
<p><a href="https://github.com/heroku/heroku-buildpack-metrics/blob/master/.profile.d/heroku-metrics-daemon.sh">https://github.com/heroku/heroku-buildpack-metrics/blob/master/.profile.d/heroku-metrics-daemon.sh</a></p>



<a name="235021793"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235021793" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235021793">(Apr 17 2021 at 21:55)</a>:</h4>
<p><a href="https://devcenter.heroku.com/articles/log-runtime-metrics">https://devcenter.heroku.com/articles/log-runtime-metrics</a> seems to be a beta feature/experiment in exposing per-dyno metrics via logs</p>



<a name="235021816"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235021816" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235021816">(Apr 17 2021 at 21:56)</a>:</h4>
<p><a href="https://github.com/heroku/agentmon">https://github.com/heroku/agentmon</a></p>



<a name="235021826"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235021826" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235021826">(Apr 17 2021 at 21:56)</a>:</h4>
<p>yep</p>



<a name="235021857"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235021857" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235021857">(Apr 17 2021 at 21:56)</a>:</h4>
<p><a href="https://github.com/heroku/agentmon/blob/master/doc/design.md">https://github.com/heroku/agentmon/blob/master/doc/design.md</a></p>



<a name="235021890"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235021890" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235021890">(Apr 17 2021 at 21:57)</a>:</h4>
<p>my understanding is that the binary fetches the language-specific metrics with statsd or prometheus, collects them and sends them to the proprietary heroku backend</p>



<a name="235021942"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235021942" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235021942">(Apr 17 2021 at 21:58)</a>:</h4>
<p>but that's limited to language specific metrics (like gc stats), not application specific metrics</p>



<a name="235021969"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235021969" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235021969">(Apr 17 2021 at 21:58)</a>:</h4>
<p>well, I mean, it's probably generic?</p>



<a name="235021975"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235021975" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235021975">(Apr 17 2021 at 21:58)</a>:</h4>
<p>but regardless, I think the thing aggregates across dynos</p>



<a name="235021991"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235021991" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235021991">(Apr 17 2021 at 21:59)</a>:</h4>
<p><span class="user-mention silent" data-user-id="116122">simulacrum</span> <a href="#narrow/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio/near/235021969">said</a>:</p>
<blockquote>
<p>well, I mean, it's probably generic?</p>
</blockquote>
<p>dunno, but in any case either that or the heroku backend will discard other metrics</p>



<a name="235022177"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235022177" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235022177">(Apr 17 2021 at 22:01)</a>:</h4>
<p>do we have some read on whether <a href="https://devcenter.heroku.com/articles/production-check#app-monitoring">https://devcenter.heroku.com/articles/production-check#app-monitoring</a> is also service-level?</p>



<a name="235022183"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235022183" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235022183">(Apr 17 2021 at 22:01)</a>:</h4>
<p>e.g., if I told you to just use new relic, could you?</p>



<a name="235022266"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235022266" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235022266">(Apr 17 2021 at 22:02)</a>:</h4>
<p>it basically seems like heroku expects people to do this via logs - <a href="https://devcenter.heroku.com/articles/production-check#app-monitoring">https://devcenter.heroku.com/articles/production-check#app-monitoring</a></p>



<a name="235022335"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235022335" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235022335">(Apr 17 2021 at 22:03)</a>:</h4>
<p>I did think that one alternative to running prometheus is to basically run <code>curl localhost/metrics &gt; file &amp;&amp; aws s3 cp file s3://</code> and then scrape that</p>



<a name="235022388"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235022388" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235022388">(Apr 17 2021 at 22:04)</a>:</h4>
<p>but it doesn't seem obviously better, though in some sense it has significantly fewer moving parts</p>



<a name="235023047"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235023047" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235023047">(Apr 17 2021 at 22:16)</a>:</h4>
<p><span class="user-mention silent" data-user-id="116122">simulacrum</span> <a href="#narrow/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio/near/235022177">said</a>:</p>
<blockquote>
<p>do we have some read on whether <a href="https://devcenter.heroku.com/articles/production-check#app-monitoring">https://devcenter.heroku.com/articles/production-check#app-monitoring</a> is also service-level?</p>
</blockquote>
<p>so, newrelic works at the instance-level, but that's because newrelic requires you to embed their collector inside your application, which then POSTs the metrics to the newrelic backend</p>



<a name="235023099"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235023099" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235023099">(Apr 17 2021 at 22:17)</a>:</h4>
<p>in our case the embedded collector would be the prometheus binary doing a remote write</p>



<a name="235023148"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235023148" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235023148">(Apr 17 2021 at 22:18)</a>:</h4>
<p><span class="user-mention silent" data-user-id="116122">simulacrum</span> <a href="#narrow/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio/near/235022266">said</a>:</p>
<blockquote>
<p>it basically seems like heroku expects people to do this via logs - <a href="https://devcenter.heroku.com/articles/production-check#app-monitoring">https://devcenter.heroku.com/articles/production-check#app-monitoring</a></p>
</blockquote>
<p>yep, that's how they implement a lot of their current metrics too</p>



<a name="235023349"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235023349" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235023349">(Apr 17 2021 at 22:21)</a>:</h4>
<p><span class="user-mention silent" data-user-id="121055">Pietro Albini</span> <a href="#narrow/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio/near/235023047">said</a>:</p>
<blockquote>
<p><span class="user-mention silent" data-user-id="116122">simulacrum</span> <a href="#narrow/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio/near/235022177">said</a>:</p>
<blockquote>
<p>do we have some read on whether <a href="https://devcenter.heroku.com/articles/production-check#app-monitoring">https://devcenter.heroku.com/articles/production-check#app-monitoring</a> is also service-level?</p>
</blockquote>
<p>so, newrelic works at the instance-level, but that's because newrelic requires you to embed their collector inside your application, which then POSTs the metrics to the newrelic backend</p>
</blockquote>
<p>ah, so this is basically the same as what we're doing</p>



<a name="235023350"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235023350" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235023350">(Apr 17 2021 at 22:21)</a>:</h4>
<p>yeah</p>



<a name="235025756"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235025756" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235025756">(Apr 17 2021 at 23:06)</a>:</h4>
<p><span class="user-mention silent" data-user-id="116122">simulacrum</span> <a href="#narrow/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio/near/235020577">said</a>:</p>
<blockquote>
<p><span class="user-mention silent" data-user-id="121055">Pietro Albini</span> one additional question I had -- is this setup going to be similar to what we end up with on other services (e.g., <a href="http://docs.rs">docs.rs</a>) if we end up putting those behind e.g. ELB after moving parts of them to ECS?</p>
</blockquote>
<p>I looked a bit more into this, and it turns out it's even easier than I thought</p>



<a name="235025780"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235025780" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235025780">(Apr 17 2021 at 23:06)</a>:</h4>
<p>we just need to configure ecs to also create "service discovery" dns records into a private dns zone</p>



<a name="235025789"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235025789" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235025789">(Apr 17 2021 at 23:07)</a>:</h4>
<p>(basically a bunch of A records automatically managed by AWS)</p>



<a name="235025801"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235025801" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235025801">(Apr 17 2021 at 23:07)</a>:</h4>
<p>and tell prometheus to scrape all the records in that zone</p>



<a name="235025810"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235025810" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235025810">(Apr 17 2021 at 23:07)</a>:</h4>
<p>so for the stuff we have in ECS it's fairly easy</p>
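<p>A minimal sketch of what "scrape all the records in that zone" amounts to: resolve one service-discovery name to its A records and turn each address into a scrape target. Prometheus can do this lookup itself through <code>dns_sd_configs</code>, so this is only an illustration of the idea, not the actual setup. The function name, the hostname in the usage note, and the port are hypothetical.</p>

```python
import socket


def discover_targets(hostname, port, resolve=socket.getaddrinfo):
    """Return sorted 'ip:port' scrape targets, one per A record of hostname.

    `resolve` defaults to the system resolver and can be swapped out,
    which keeps the sketch usable without network access.
    """
    infos = resolve(hostname, port, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for AF_INET the sockaddr is an (ip, port) pair.
    addresses = {info[4][0] for info in infos}
    return sorted(f"{ip}:{port}" for ip in addresses)
```

<p>Assuming ECS registers the tasks under a private name such as <code>web.internal.example</code> (hypothetical), calling <code>discover_targets("web.internal.example", 9100)</code> would return one target per running task, and the list changes as AWS adds or removes A records.</p>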



<a name="235025922"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235025922" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235025922">(Apr 17 2021 at 23:09)</a>:</h4>
<p>good to know</p>



<a name="235844218"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/235844218" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#235844218">(Apr 23 2021 at 13:56)</a>:</h4>
<p>granted the <a href="http://crates.io">crates.io</a> team access to grafana and started scraping service-level metrics</p>



<a name="238183052"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/238183052" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#238183052">(May 10 2021 at 17:44)</a>:</h4>
<p>so, I looked into this briefly a bit more</p>



<a name="238183074"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/238183074" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#238183074">(May 10 2021 at 17:45)</a>:</h4>
<p><a href="https://www.reddit.com/r/grafana/comments/mjdz67/monitoring_heroku_dynos_addons_like_postgres/">https://www.reddit.com/r/grafana/comments/mjdz67/monitoring_heroku_dynos_addons_like_postgres/</a></p>



<a name="238183163"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/238183163" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#238183163">(May 10 2021 at 17:45)</a>:</h4>
<p>so, it seems that there are a couple of off-the-shelf tools that do "run a regex over this log stream and export that as metrics"</p>



<a name="238183180"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/238183180" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#238183180">(May 10 2021 at 17:45)</a>:</h4>
<p><a href="https://github.com/carlpett/stream_exporter">https://github.com/carlpett/stream_exporter</a><br>
<a href="https://github.com/fstab/grok_exporter">https://github.com/fstab/grok_exporter</a></p>
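<p>A rough sketch of the "run a regex over this log stream and export that as metrics" idea, under stated assumptions: the log line format (<code>sample#name=value</code> pairs, as Heroku Postgres emits in its log lines) and the <code>heroku_postgres_</code> metric prefix are illustrative choices, and this is not how either of the exporters linked above is implemented.</p>

```python
import re

# Matches pairs such as "sample#db_size=4469950648bytes"; a trailing unit
# suffix like "bytes" is left out of the captured number.
SAMPLE_RE = re.compile(r"sample#([a-z0-9_-]+)=([0-9]+(?:\.[0-9]+)?)")


def parse_samples(line):
    """Extract every sample#name=value pair from one log line as floats."""
    return {
        # Prometheus metric names cannot contain hyphens.
        name.replace("-", "_"): float(value)
        for name, value in SAMPLE_RE.findall(line)
    }


def render_metrics(samples, prefix="heroku_postgres_"):
    """Render the samples in the Prometheus text exposition format."""
    lines = [f"{prefix}{name} {value}" for name, value in sorted(samples.items())]
    return "\n".join(lines) + "\n"
```

<p>Feeding each incoming log line through <code>parse_samples</code> and serving the latest <code>render_metrics</code> output over HTTP would give Prometheus something to scrape; a real exporter also needs to handle staleness and multiple databases.</p>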



<a name="238183963"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/238183963" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#238183963">(May 10 2021 at 17:50)</a>:</h4>
<p>yeah can't really find much else for integrating prometheus and heroku postgres</p>



<a name="238703370"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/238703370" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> simulacrum <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#238703370">(May 14 2021 at 01:03)</a>:</h4>
<p>Had an idea while reading <a href="https://fly.io/blog/measuring-fly/">https://fly.io/blog/measuring-fly/</a> which might be a bit wild, but we could in theory have the heroku machines set up reverse tunnels or something on startup (even as easy as ssh) to the monitoring server so they could get scraped. Doesn't help with the postgres monitoring unfortunately, though.</p>



<a name="238736049"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/238736049" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#238736049">(May 14 2021 at 08:43)</a>:</h4>
<p>that seems... fragile <span aria-label="sweat smile" class="emoji emoji-1f605" role="img" title="sweat smile">:sweat_smile:</span></p>



<a name="238736157"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/238736157" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#238736157">(May 14 2021 at 08:44)</a>:</h4>
<p>in theory it's a cool idea, in practice I think either running prometheus on the dynos or parsing metrics from logs would be more reliable</p>



<a name="240836441"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/240836441" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#240836441">(May 31 2021 at 13:55)</a>:</h4>
<p>ok I think with <a href="https://vector.dev">https://vector.dev</a> we can get something off-the-shelf that's not a pain to configure</p>



<a name="240836476"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/240836476" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#240836476">(May 31 2021 at 13:56)</a>:</h4>
<p>playing around with it right now, I should hopefully have something ready by the meeting</p>



<a name="240915325"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/240915325" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#240915325">(Jun 01 2021 at 09:21)</a>:</h4>
<p><span aria-label="tada" class="emoji emoji-1f389" role="img" title="tada">:tada:</span> <a href="https://crates-io-heroku-metrics.infra.rust-lang.org/health">https://crates-io-heroku-metrics.infra.rust-lang.org/health</a></p>



<a name="240922836"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/240922836" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#240922836">(Jun 01 2021 at 10:39)</a>:</h4>
<p><span aria-label="tada" class="emoji emoji-1f389" role="img" title="tada">:tada:</span> <span aria-label="tada" class="emoji emoji-1f389" role="img" title="tada">:tada:</span> <span aria-label="tada" class="emoji emoji-1f389" role="img" title="tada">:tada:</span> <a href="https://grafana.rust-lang.org/d/IDdGv46Mk/heroku-postgres">https://grafana.rust-lang.org/d/IDdGv46Mk/heroku-postgres</a> (team-only)</p>



<a name="241174157"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/241174157" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#241174157">(Jun 02 2021 at 16:29)</a>:</h4>
<p>and opened PRs for supporting instance-level metrics <span aria-label="tada" class="emoji emoji-1f389" role="img" title="tada">:tada:</span> <span aria-label="tada" class="emoji emoji-1f389" role="img" title="tada">:tada:</span> <span aria-label="tada" class="emoji emoji-1f389" role="img" title="tada">:tada:</span></p>



<a name="241174170"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/241174170" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#241174170">(Jun 02 2021 at 16:29)</a>:</h4>
<p><a href="https://github.com/rust-lang/crates.io/pull/3674">https://github.com/rust-lang/crates.io/pull/3674</a></p>



<a name="241174174"></a>
<h4><a href="https://rust-lang.zulipchat.com#narrow/stream/242791-t-infra/topic/prometheus%20for%20crates.io/near/241174174" class="zl"><img src="https://rust-lang.github.io/zulip_archive/assets/img/zulip.svg" alt="view this post on Zulip" style="width:20px;height:20px;"></a> Pietro Albini <a href="https://rust-lang.github.io/zulip_archive/stream/242791-t-infra/topic/prometheus.20for.20crates.2Eio.html#241174174">(Jun 02 2021 at 16:29)</a>:</h4>
<p><a href="https://github.com/rust-lang/crates-io-heroku-metrics/pull/1">https://github.com/rust-lang/crates-io-heroku-metrics/pull/1</a></p>



<hr><p>Last updated: Aug 07 2021 at 22:04 UTC</p>
</html>